Add release notes for meshlet BVH culling #20526
The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).

Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
- Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
+ Speaking of GPU cost, the scene above renders in about 3.5 ms on a 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
I was referring to the 4070 + 1440p combo mentioned above
Co-authored-by: atlv <[email protected]>
Co-authored-by: JMS55 <[email protected]>
(TODO: Embed example screenshot here)

Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
- Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
+ Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry. (Then write something here about 120k vs 1 million instances).
Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.

This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
- This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
+ This also gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
and total instance limits due to the current architecture requiring all instances to be uploaded to the GPU every frame.
- There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
- and total instance limits due to the current architecture requiring all instances to be uploaded to the GPU every frame.
+ There are now *no* hardcoded limits to scene size. In practice you will only be limited by asset VRAM usage (since streaming is not yet implemented),
+ and total instance count due to the current code requiring all instances to be re-uploaded to the GPU every frame.
and total instance limits due to the current architecture requiring all instances to be uploaded to the GPU every frame.

The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).
I don't think this sentence would mean much to people not very familiar with the rendering code.
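One way to make the 60 fps sentence more concrete for readers is to put the quoted timings next to the frame budget. The sketch below is only illustrative arithmetic on the numbers in the draft (60 fps, 14 ms of CPU time for the instance upload, about 3.5 ms of GPU time); the function names are hypothetical and are not part of Bevy.

```rust
/// Frame budget in milliseconds for a target frame rate.
fn frame_budget_ms(target_fps: f64) -> f64 {
    1000.0 / target_fps
}

/// Fraction of the frame budget consumed by a task that takes `task_ms`.
/// (Hypothetical helper for illustration only.)
fn budget_fraction(task_ms: f64, target_fps: f64) -> f64 {
    task_ms / frame_budget_ms(target_fps)
}
```

With these helpers, `frame_budget_ms(60.0)` is roughly 16.7 ms, `budget_fraction(14.0, 60.0)` is 0.84 for the instance upload on the CPU, and `budget_fraction(3.5, 60.0)` is 0.21 for the GPU work. Something like "the CPU-side instance upload uses most of the frame budget, while the GPU uses about a fifth of it" may read better than the raw millisecond figures.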
However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).

Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
After increasing the instance count to over 1 million (almost *900 billion triangles*!), the GPU time increases to about 4.5 ms, with ~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
Add something like "this is X times the triangle count, but only Y more GPU time"
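The X and Y for that sentence can be worked out from the rounded figures in the draft. The sketch below assumes exactly 1,000,000 instances for the "over 1 million" case and 870,000 triangles per dragon; both are approximations taken from the text, and the helper names are hypothetical.

```rust
/// Triangles per dragon, as rounded in the release notes draft.
const TRIANGLES_PER_DRAGON: u64 = 870_000;

/// Total triangles in a scene with `instances` dragons.
fn total_triangles(instances: u64) -> u64 {
    instances * TRIANGLES_PER_DRAGON
}

/// Ratio between two measurements, e.g. "X times the triangles".
fn ratio(after: f64, before: f64) -> f64 {
    after / before
}
```

Under those assumptions, `ratio(1_000_000.0, 130_000.0)` is about 7.7 and `ratio(4.5, 3.5)` is about 1.29, so the sentence could read "about 7.7 times the triangle count for about 1.3 times the GPU time (roughly 1 ms more)". Note that `total_triangles(130_000)` comes out at 113.1 billion with the rounded inputs, slightly under the "over 115 billion" in the draft, so the exact instance or triangle counts are presumably a little higher than the rounded ones and may be worth double-checking.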
Co-authored-by: atlv <[email protected]>